Search for: All records

Creators/Authors contains: "Bhattacharya, Ratnadeep"


  1. The microservices architecture simplifies application development by breaking monolithic applications into manageable microservices. However, this distributed microservice “service mesh” leads to new challenges due to the more complex application topology. In particular, each service component scales up and down independently, creating load imbalance problems on shared backend services accessed by multiple components. Traditional load balancing algorithms do not port over well to a distributed microservice architecture where load balancers are deployed client-side. In this article, we propose a self-managing load balancing system, BLOC, which provides consistent response times to users without using a centralized metadata store or explicit messaging between nodes. BLOC uses overload control approaches to provide feedback to the load balancers. We show that this performs significantly better in solving the incast problem in microservice architectures. A critical component of BLOC is the dynamic capacity estimation algorithm. We show that a well-tuned capacity estimate can outperform even join-the-shortest-queue, a nearly optimal algorithm, while a reasonable dynamic estimate still outperforms Least Connection, a distributed implementation of join-the-shortest-queue. Evaluating this framework, we found that BLOC improves the response time distribution range (between the 10th and 90th percentiles) by 2–4 times and the tail (99th percentile) latency by 2 times.
    Free, publicly-accessible full text available December 31, 2025
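The BLOC abstract above describes client-side load balancers that track a dynamic capacity estimate per backend and adjust it from overload-control feedback, with no shared metadata store. A minimal sketch of that feedback loop, assuming an AIMD-style estimator and illustrative names (`Backend`, `pick`, `on_response`, and the numeric constants are not from the paper):

```python
import random

class Backend:
    def __init__(self, name):
        self.name = name
        self.outstanding = 0      # requests in flight from this client
        self.capacity_est = 4.0   # dynamic capacity estimate (requests)

    def overloaded(self):
        return self.outstanding >= self.capacity_est

def pick(backends):
    """Client-side choice: prefer backends under their estimated
    capacity; fall back to a random backend if all look overloaded."""
    candidates = [b for b in backends if not b.overloaded()]
    chosen = random.choice(candidates if candidates else backends)
    chosen.outstanding += 1
    return chosen

def on_response(backend, overload_signal):
    """Adjust the capacity estimate from the backend's overload-control
    feedback: multiplicative decrease on overload, additive probe otherwise."""
    backend.outstanding -= 1
    if overload_signal:
        backend.capacity_est = max(1.0, backend.capacity_est * 0.7)
    else:
        backend.capacity_est += 0.1
```

Because each client updates its estimate only from responses it already receives, no explicit messaging between load balancer nodes is needed, matching the decentralized design the abstract describes.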
  2. Event-driven applications are often built with message queuing systems that provide no temporal upper bound on message delivery. However, many modern event-driven applications, like a system inferring traffic conditions and generating recommendations to road users based on sensor data, are latency sensitive. Traditional message queuing systems use static load assignment algorithms that guarantee event ordering while mostly ignoring a temporal upper bound on message delivery. Another class of message queuing systems uses stateless operators, which deliver messages (events) quickly but pass the burden of stream state management to user applications. Synchronous communication patterns, on the other hand, provide an upper bound for message delivery while ensuring message ordering but unnecessarily bind limited resources, reducing efficiency. In this paper we explore load balancing choices in asynchronous systems and their impact on queuing delay. We then propose a load balancing framework, SMALOPS, for event-driven applications with dynamically changing load and quick message delivery requirements. Our experiments confirm that with smarter load balancing, the 99th percentile response times for events can be improved by as much as 73%, compared to traditional message queuing systems. SMALOPS introduces the following:
     - A load balancing algorithm that can significantly reduce queuing delay in message delivery systems.
     - Mechanisms enabling consumers to recover stream state when either the message delivery system does not support stateful operators or the state has been split by moving streams between operators.
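The SMALOPS abstract contrasts static stream-to-consumer assignment with load balancing that reduces queuing delay while preserving per-stream ordering. A minimal sketch of one such policy, assuming streams are characterized by arrival rate and greedily placed on the least-loaded consumer (the function names and the greedy heuristic are illustrative, not the paper's algorithm):

```python
import heapq
from collections import defaultdict

def assign_streams(stream_rates, consumers):
    """Greedily assign each stream (whole, so per-stream ordering is
    preserved) to the consumer with the least total arrival rate,
    approximating a minimum-queuing-delay placement.

    stream_rates: dict mapping stream name -> arrival rate (events/sec)
    consumers:    list of consumer names
    """
    heap = [(0.0, c) for c in consumers]   # (current load, consumer)
    heapq.heapify(heap)
    assignment = {}
    load = defaultdict(float)
    # Place heavier streams first so they spread across consumers.
    for stream, rate in sorted(stream_rates.items(), key=lambda kv: -kv[1]):
        total, c = heapq.heappop(heap)
        assignment[stream] = c
        load[c] = total + rate
        heapq.heappush(heap, (load[c], c))
    return assignment
```

A dynamic system would re-run a policy like this as rates change and move streams between consumers, which is exactly why the abstract's second contribution (state recovery when streams move between operators) is needed.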
  3. Serverless computing platforms simplify development, deployment, and automated management of modular software functions. However, existing serverless platforms typically assume an over-provisioned cloud, making them a poor fit for Edge Computing environments where resources are scarce. In this paper we propose a redesigned serverless platform that comprehensively tackles the key challenges for serverless functions in a resource-constrained Edge Cloud. Our Mu platform cleanly integrates the core resource management components of a serverless platform: autoscaling, load balancing, and placement. Each worker node in Mu transparently propagates metrics such as service rate and queue length in response headers, feeding this information to the load balancing system so that it can better route requests, and to our autoscaler to anticipate workload fluctuations and proactively meet SLOs. Data from the autoscaler is then used by the placement engine to account for heterogeneity and fairness across competing functions, ensuring overall resource efficiency and minimizing resource fragmentation. We implement our design as a set of extensions to the Knative serverless platform and demonstrate its improvements in terms of resource efficiency, fairness, and response time. Evaluating Mu shows that it improves fairness by more than 2x over the default Kubernetes placement engine, improves 99th percentile response times by 62% through better load balancing, and reduces SLO violations and resource consumption through proactive and precise autoscaling. Mu reduces the average number of pods required by more than 15% for a set of real Azure workloads.
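The Mu abstract describes workers piggybacking service rate and queue length in response headers so the load balancer can route to the least-delayed worker. A minimal sketch of that routing loop, assuming hypothetical header names (`X-Queue-Length`, `X-Service-Rate`) and a simple expected-delay estimate; none of these identifiers come from the paper:

```python
def expected_delay(queue_len, service_rate):
    """Rough time for a new request to complete at a worker:
    (requests already queued + this one) / service rate."""
    return (queue_len + 1) / service_rate

def route(workers):
    """Pick the worker with the lowest expected delay, based on the
    metrics most recently piggybacked in its response headers."""
    return min(workers, key=lambda w: expected_delay(w["queue"], w["rate"]))

def on_response(worker, headers):
    """Update a worker's metrics from the headers it piggybacked on a
    response (hypothetical header names for illustration)."""
    worker["queue"] = int(headers["X-Queue-Length"])
    worker["rate"] = float(headers["X-Service-Rate"])
```

Piggybacking metrics on responses the load balancer already receives avoids a separate monitoring channel, which matches the "transparently propagates metrics" design the abstract describes; the same metrics can feed an autoscaler.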